Bagging, Boosting, and C4.5

Author

  • J. R. Quinlan
Abstract

Breiman’s bagging and Freund and Schapire’s boosting are recent methods for improving the predictive power of classifier learning systems. Both form a set of classifiers that are combined by voting, bagging by generating replicated bootstrap samples of the data, and boosting by adjusting the weights of training instances. This paper reports results of applying both techniques to a system that learns decision trees and testing on a representative collection of datasets. While both approaches substantially improve predictive accuracy, boosting shows the greater benefit. On the other hand, boosting also produces severe degradation on some datasets. A small change to the way that boosting combines the votes of learned classifiers reduces this downside and also leads to slightly better results on most of the datasets considered.
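The two ensemble schemes summarized in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: `train_stump` is a hypothetical one-feature threshold learner standing in for a decision-tree system such as C4.5, the reweighting follows the commonly described AdaBoost.M1 update (an assumption about which boosting variant is meant), and the function names, toy data, and round counts are invented for illustration. The modified voting rule mentioned in the abstract is not reproduced here.

```python
import math
import random
from collections import Counter

def train_stump(xs, ys, weights):
    """Hypothetical base learner: the threshold rule on one numeric feature
    with the lowest weighted error (a stand-in for a decision-tree learner)."""
    best = None
    for t in sorted(set(xs)):
        for left, right in ((0, 1), (1, 0)):
            err = sum(w for x, y, w in zip(xs, ys, weights)
                      if (left if x <= t else right) != y)
            if best is None or err < best[0]:
                best = (err, t, left, right)
    _, t, left, right = best
    return lambda x: left if x <= t else right

def bag(xs, ys, rounds, rng):
    """Bagging: train on bootstrap replicates, combine by unweighted vote."""
    n = len(xs)
    models = []
    for _ in range(rounds):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        models.append(train_stump([xs[i] for i in idx],
                                  [ys[i] for i in idx], [1.0 / n] * n))
    return lambda x: Counter(m(x) for m in models).most_common(1)[0][0]

def reweight(weights, correct, eps):
    """Shrink the weights of correctly classified instances by
    eps / (1 - eps), then renormalize so the weights sum to 1."""
    beta = eps / (1.0 - eps)
    scaled = [w * beta if ok else w for w, ok in zip(weights, correct)]
    total = sum(scaled)
    return [w / total for w in scaled]

def boost(xs, ys, rounds):
    """Boosting (AdaBoost.M1-style, assumed): reweight instances each round
    and combine classifiers by a vote weighted with log((1 - eps) / eps)."""
    n = len(xs)
    weights = [1.0 / n] * n
    models = []
    for _ in range(rounds):
        h = train_stump(xs, ys, weights)
        correct = [h(x) == y for x, y in zip(xs, ys)]
        eps = sum(w for w, ok in zip(weights, correct) if not ok)
        if eps == 0 or eps >= 0.5:
            # A perfect classifier would dominate the vote, so keep it alone.
            # Sketch-only fallback: also keep h if nothing was learned yet.
            if eps == 0 or not models:
                models = [(1.0, h)]
            break
        models.append((math.log((1.0 - eps) / eps), h))
        weights = reweight(weights, correct, eps)

    def predict(x):
        votes = Counter()
        for alpha, h in models:
            votes[h(x)] += alpha
        return votes.most_common(1)[0][0]
    return predict

# Toy, linearly separable data (invented for this sketch).
xs = list(range(1, 11))
ys = [0] * 5 + [1] * 5
bagged = bag(xs, ys, rounds=11, rng=random.Random(0))
boosted = boost(xs, ys, rounds=10)
```

In this sketch, bagging varies the training set by resampling and gives every classifier one vote, while boosting keeps the training set fixed, shifts weight toward the instances the previous classifier got wrong, and gives more accurate classifiers a larger say.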


Similar resources

Evaluation of liquefaction potential based on CPT results using C4.5 decision tree

The prediction of liquefaction potential of soil due to an earthquake is an essential task in Civil Engineering. The decision tree is a tree structure consisting of internal and terminal nodes which process the data to ultimately yield a classification. C4.5 is a known algorithm widely used to design decision trees. In this algorithm, a pruning process is carried out to solve the problem of the...

Comparison of Classifier Algorithms in the Identification of Polypharmacy and Factors Affecting it in the Elderly Patients

Introduction: Prescribing and consuming more drugs than necessary, known as polypharmacy, both wastes resources and harms patients. Polypharmacy is especially important for elderly patients; therefore, the factors affecting it must be identified and analyzed properly. Method: In this retrospective study, first, several classifier algorithms, i.e., C4.5, SVM, KNN, MLP, and BN for ...

A Backward Adjusting Strategy and Optimization of the C4.5 Parameters to Improve C4.5's Performance

In machine learning, decision trees are employed extensively in solving classification problems. In order to design a decision tree classifier two main phases are employed. The first phase is to grow the tree using a set of data, called training data, quite often to its maximum size. The second phase is to prune the tree. The pruning phase produces a smaller tree with better generalization (sma...

Are Decision Trees Always Greener on the Open (Source) Side of the Fence?

This short paper compares the performance of three popular decision tree algorithms: C4.5, C5.0, and WEKA’s J48. These decision tree algorithms are all related in that C5.0 is an updated commercial version of C4.5 and J48 is an implementation of the C4.5 algorithm under the WEKA data mining platform. The purpose of this paper is to verify the explicit or implied performance claims for these alg...

Preservation of the Sample Data with Help of Unrealized Training Datasets and later classifying it Using Modified C4.5 Algorithm

To protect data centrally when it is transferred from one party to another, so that it cannot be used for secondary purposes, the unrealized training dataset is an important technique. With the help of the unrealized training dataset algorithm, the sample data is divided into two forms, i.e. Tp, a set of perturbing datasets, and T’, a set of output training datasets...

Publication date: 1999